Audio network distribution system

ABSTRACT

An audio distribution network system (20) allows an audio distribution system to be created that is integrated with the home automation system into a home network that permits vocal feedback, status, and even control with the audio through network speakers (100).

BACKGROUND OF THE INVENTION

Currently, most audio speakers are passive devices that receive an analog or digital audio signal. A few advanced models have limited self-diagnostics that can be communicated out over additional wire runs as well. These speakers are usually wired to racks or source switching pre-amps and amplifiers. The problem with this approach is that these systems are not very flexible. It is hard to expand the audio sources that can be heard through the speakers embedded in walls or other places after the system has been installed without buying and installing additional costly components. Other audio sources, such as home control system voice communication, intercom audio, soundtracks for CD-ROM games, solid-state sound memories, digital audio broadcasting systems, and even Internet audio, cannot easily be added and routed through to the existing speakers at a future date if the existing system was not originally designed to input and handle them. This is mostly due to the ongoing proliferation of new audio compression formats. High-quality digital audio data takes a lot of hard disk space to store (or channel bandwidth to transmit). Because of this, many companies have worked on compressing and/or coding the bit stream to allow for a smaller binary footprint. This allows high-quality music to take up less storage space and to be transported across vast networks with a smaller amount of data, and therefore less bandwidth. However, these new compression and encoding formats require that decompression and decoding be performed to reconstitute the original audio before it is played out the loudspeaker. If an existing audio system is limited to reconstituting only audio formats known at the time of installation, the audio system quickly becomes obsolete.

Many new products have wireless network capabilities, but still cannot be easily connected into a home network because of a lack of easily accessible wireless to wired network bridging within range of the device. This can especially be a problem if the wireless device is a handheld mobile unit such as a PDA that, due to a lack of access points, cannot communicate from all rooms in the house.

The current approach to controlling audio and doing home automation is often cumbersome. The sound system remote that allows the room audio level to be adjusted does not allow the room lights to be dimmed. Therefore, different remote controllers are needed for each function. Nor do users like the “wall clutter” created by putting multiple separate audio and other home network control units in the walls. Wireless solutions to this problem, such as Radio Frequency (RF) or Infra-Red (IR), have limitations. The biggest limitation for RF is that in many large cities the RF noise background is very high, creating communication problems, and there may be health concerns with excessive RF. The IR limitation is that IR is effective in “line of sight” only, and the home automation devices to be controlled may be in other rooms. These problems are compounded in retrofit situations where minimal changes that affect the current building and existing systems are desired.

It is therefore the object of this invention to provide a networked speaker, so that an audio distribution system can be created that is integrated with the home automation system into a home network that permits vocal feedback, status, and even control with the audio through the network speakers. The network should let the user know what is happening, and provide very intuitive instruction on how to use the system. This will enable the audio speakers to easily adjust to and allow new audio sources and to become wireless access points in the home, or provide the wireless bridge to the hard-wired network.

DESCRIPTION OF THE DRAWINGS

These and other objects and features of the invention will become more apparent upon a perusal of the following description taken in conjunction with the accompanying drawings wherein:

FIG. 1 is a circuit diagram of an audio distribution system;

FIG. 2 is a circuit diagram of a network speaker embodiment of the system shown in FIG. 1;

FIG. 3 is a circuit diagram of another network speaker embodiment;

FIG. 4 is a circuit diagram of another network speaker embodiment;

FIG. 5 is a circuit diagram of another network speaker embodiment;

FIG. 6 is a circuit diagram of another network speaker embodiment;

FIG. 7 is a circuit diagram of a CODEC circuit for use in the network speaker embodiments of FIGS. 2-6;

FIG. 8 is a circuit diagram of a Legacy Audio Converter/Controller for use in the system shown in FIG. 1;

FIG. 9 is a circuit diagram of a network speaker including power options; and

FIG. 10 is a network speaker including battery powered options and an energy storage module.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

An audio distribution network system 20 (FIG. 1) includes a plurality of speaker node units 100 which are coupled to a Transmission Control Protocol/Internet Protocol (TCP/IP) based network backbone 200. Also coupled to the network backbone 200 are networked audio source node devices 300, an Internet service interface 400, and a Legacy converter/controller 600. Legacy sources 500 provide analog or digital linear PCM (Pulse Code Modulation) audio to be converted into a packet switched digital coding for transport across the network. They will also provide analog video which will be used for control status feedback, as well as for conversion to a packet switched digital coding for transport across the network. In addition, the Legacy sources 500 will also receive IR or serial commands from the converter/controller 600, which also communicates with a Legacy home control network 700. Some legacy sources 500 may also provide serial communications to the converter/controller 600.

The source devices 300 can consist of any number of networked digital audio source devices (music playback devices) such as personal computers or audio servers that are able to communicate with one another over the shared TCP/IP network 200 and have the resources to serve digital audio files (WMA, MP3, Corona, etc.) to the network. Bit streamed audio (digital music, in the form of binary data that is sent in packets) from the Internet also may enter the system 20 from the Internet interface 400. The Legacy audio devices 500 (existing analog audio equipment, e.g. CD players, tape decks, VCRs) have their audio converted into a packet switched digital network format (WMA, MP3, Corona) by the Legacy Converter 600 or by the network speakers 100. The network speaker 100 can also real time encode sound received from its internal microphone, or from reversing the transduction circuit of the speaker, to capture sound waves present in the room, and then code that sound and provide it for use on the network 20, including by use of differential masking for control purposes. Any new device that is able to send audio out on the network can serve as the audio source for a network speaker 100 as long as the network speaker 100 understands the audio format. Control commands that affect the audio distribution can come from the network server 300, the Internet interface 400, the legacy home control network 700 via the legacy converter/controller 600, or from other network speakers 100.

The system 20 is a collection of independent computers or other intelligent devices that communicate with one another over the shared TCP/IP network 200. For example, the system 20 can be part of the Internet-linked networks that are worldwide in scope and facilitate data communication services such as remote login, file transfer, electronic mail, the World Wide Web and newsgroups, or, for security reasons, part of a home intranet network utilizing Internet-type tools but available only within that home. The home intranet is usually connected to the Internet via an Internet interface 400. Intranets are often referred to as LANs (Local Area Networks).

The home network backbone 200 communicates using the TCP/IP network protocol, consisting of standards that allow network members to communicate. A protocol defines how computers and other intelligent devices will identify one another on a network, the form that the data should take in transit, and how this information is processed once it reaches its final destination. Protocols also define procedures for handling lost or damaged transmissions or “packets”. The TCP/IP network protocol is made up of layers of protocols, each building on the protocol layers below it. The basic layer is the physical layer protocol that defines how the data is physically sent through the physical communication medium, such as Thickwire, thin coax, unshielded twisted pair, fiber optic cable, telephone cable, RF, IR, power line wires, etc. Those physical media requiring an actual physical connection of some type to the network device, such as Thickwire, thin coax, unshielded twisted pair, fiber optic cable, power line, or telephone cable, are called wired media. Those physical media not requiring an actual physical wire connection of any type to the network device, such as RF and IR, are called wireless media. A TCP/IP home network can be totally wired, totally wireless, or a mix of wireless and wired. A TCP/IP home network is not limited to a single physical communication medium. Different physical communication media can be connected together by bridging components to create a unified communication network. Each network physical medium has its physical layer protocol that defines the form that the data should take in transit on that particular physical medium. The bridging component enables the transfer and conversion of communication on one physical medium and its physical layer protocol to a different physical medium and its physical layer protocol.
Bridging components also may provide a proxy from one network to the other; this will be common between UPnP V1 and V2, and between IPv6 and IPv4 (Internet Protocol versions 6 and 4). Common physical layer LAN technologies in use today include Ethernet, Token Ring, Fast Ethernet, Fiber Distributed Data Interface (FDDI), Asynchronous Transfer Mode (ATM) and LocalTalk. Physical layer protocols that are very similar over slightly different physical media are sometimes referred to by the same name but as different types. Examples are the three common types of Fast Ethernet: 100BASE-TX for use with level 5 UTP cable, 100BASE-FX for use with fiber-optic cable, and 100BASE-T4, which utilizes an extra two wires for use with level 3 UTP cable. The TCP/IP protocol layers are well known and will not be described in greater detail.

The system 20 may have any number of networked self-sufficient digital audio source devices 300 in it, such as a digital music storage device, PC, music player, Personal Digital Assistant (PDA), on board automobile music system, digital integrated audio equipment, personal digital recorder or video digital recorder. Networked audio source devices 300 can provide digital audio files such as WMA, MP3, “Corona”, and MLP from their hard disk, internal flash, or an audio input such as a microphone, CD reader or music player. Also, the networked audio source devices 300 can encompass a specialized network server, usually a specialized, network-based hardware device designed to perform a single or specialized set of server functions. It is usually characterized by a minimal operating architecture, and client access that is independent of any operating system or proprietary protocol. Print servers, terminal servers, audio servers, control remote access servers and network time servers are examples of server devices which are specialized for particular functions. Often these types of servers have unique configuration attributes in hardware or software that help them to perform best in their particular arena. While specialized hardware devices are often used to perform these functions in large systems, the specialized functions served by the network server could be performed by a more general use computer. A single computer (sometimes a RISC (reduced instruction set computer) machine), called a web server, could combine the functionality of the networked audio source devices 300 and the Internet interface 400. If no connection to the Internet is desired, the Internet interface 400 function can be removed from the system without loss of intranet network integrity. Network and web servers are well known and will not be described in greater detail.

The legacy home control network 700 is an existing network of devices in the home used to automate and control the home. If the legacy home control network 700 cannot communicate over a shared TCP/IP network 200, it cannot directly control or be controlled by the network speakers, and the two dissimilar networks must be bridged by a Legacy Converter/Controller 600. Any legacy home control network 700 that can communicate within the system 20 over a shared TCP/IP network could be combined into the home network backbone 200, and then the legacy home control network 700 device would have access to and be able to control the network speaker 100 if it has the resources and instructions to do so. The Legacy Converter/Controller 600 can also be used to provide network access to un-networked legacy devices that are able to serve as command and control interfaces, such as the telephone, cell phone, RF remote, IR remote, direct voice controller or keypad. A networked audio source 300, such as a PDA, also can act as the legacy converter/controller for a legacy device such as an attached cell phone.

The legacy home audio sources 500 are other audio sources that are not able to communicate over a shared TCP/IP network 200, such as analog audio players, CD players, video game players, tape players, telephones, or VCRs. The Legacy Converter/Controller 600 takes the analog or digital linear PCM audio from the legacy home sources 500, converts it into an acceptable digital format or formats if needed, and serves the audio as needed over the shared TCP/IP home network backbone 200. If the legacy home audio source 500 provides analog audio to the Legacy Converter/Controller 600, the Legacy Converter/Controller 600 must convert the analog audio into an appropriate digital audio format before serving it to the network. The Legacy Converter/Controller 600 can also convert commands sent from the home network 200 to the legacy home source 500 into a command format that is understood by the legacy home source 500, such as serial, RF or IR commands. A system may have a separate Legacy Converter/Controller 600 for each legacy home source 500 or legacy home control network 700, or a single Legacy Converter/Controller 600 may convert and control more than one legacy home source 500 or multiple legacy home control networks 700.

Illustrated in FIG. 2 is one network speaker embodiment 100A. A network interface 110 couples the network backbone 200 of the system 20 (FIG. 1) to a network controller 120 which feeds a digital to analog converter (DAC) 122 via an audio format converter 121. Receiving an output from the DAC 122 is a pre-amplifier 123 which also receives inputs from speaker sensors 124. An amplifier 125 receives the output of the pre-amp 123 and feeds a speaker driver 126 coupled to speaker components 127.

The network speakers 100A may be enclosed in a case or box, in a ceiling, embedded in or behind a wall, or in a car, and constitute the most prevalent enabling components in the system 20. Each network speaker 100A communicates with the network backbone (Ethernet) 200 through the network interface 110, which handles the physical layer hardware protocol. The network interface 110 may connect to one or more physical layers, wired or wireless or both. From there the Network Speaker Controller 120 provides the intelligence to run the various application features of the network speaker, including the higher levels of the TCP/IP protocol. Audio sources (digital music content) received from the network and addressed to a particular network speaker 100A are sent to the audio format converter 121, which converts the source digital audio format into a form ready to be converted to analog. The correctly re-formatted digital signal is sent to the DAC 122 to be converted from digital to analog. The analog signal then goes to a pre-amp 123 where the signal is adjusted and filtered. Included in the pre-amp 123 can be an active crossover which operates at pre-amp level to limit the frequencies to the amplifier or amplifiers connected to it. The speaker components connected to these pre-amplifiers would therefore receive a limited frequency range, and can be optimized for the frequencies received. The pre-amp signal then goes to the amplifier section 125, and the amplified signal proceeds to the speaker driver 126 and out the speaker/microphone components 127 to become audio sound waves. Because the application software in the Network Speaker Controller 120 and audio format converter 121 can be updated over the network, and given sufficient processing power and ample memory, the network speaker 100A can be made to play currently unknown digital formats in the future. The audio format converter 121 may have the DAC 122 built in.
The speaker sensors 124, which may include temperature, SPL, ambient and noise floor, pressure, and voltage sensors, provide the on-board application speaker feedback which enables internal auto adjustment to enhance speaker protection and performance, and allows control signals to be sent back to other devices which may need or want the status information. A very useful application for this would be differential masking. This is a process in which samples from the digital source are compared against the real time encoded samples captured from within the air space. The original digital source is then subtracted from the combined real time encoding, and the result is a new sample.
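
The subtraction step of differential masking can be illustrated with a minimal sketch. It assumes, beyond what the description states, that the source samples and the captured samples are already time-aligned and gain-matched, and the function name differential_mask is hypothetical. What remains after the subtraction is the sound in the room that did not come from the speaker itself, such as a spoken command.

```python
def differential_mask(source, captured):
    """Subtract the known source samples from the samples captured in the room.

    Assumes both sequences are time-aligned and gain-matched (an assumption
    of this sketch, not a statement about the described system).
    """
    if len(source) != len(captured):
        raise ValueError("source and captured must be the same length")
    return [c - s for s, c in zip(source, captured)]


# Music being played (source) plus a separate sound present in the room.
source = [0.0, 0.5, -0.25, 0.125]
room_sound = [0.0, 0.0, 0.25, 0.0]
captured = [s + r for s, r in zip(source, room_sound)]

# The residual is the new sample: what the room contained besides the music.
residual = differential_mask(source, captured)
```

A practical implementation would also have to estimate the delay and the room response before subtracting; the sketch leaves that out.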

The network interface 110 connects the network speaker 100A to the actual network backbone 200 and will vary depending on the physical media and physical layer protocol used. Network interface cards, commonly referred to as NICs, are often used to connect PCs to a wired network, and are used in the preferred embodiment when the network backbone is some form of wired cable or fiber optics. The NIC provides a physical connection between the networking cable and the computer's internal bus. Different computers have different bus architectures; the most common are PCI slots found on 486/Pentium PCs and ISA expansion slots commonly found on 386 and older PCs. NICs come in three basic varieties: 8-bit, 16-bit, and 32-bit. The larger the number of bits that can be transferred to the NIC, the faster the NIC can transfer data to the network cable. Many NIC adapters comply with Plug-n-Play specifications. On these systems, NICs are automatically configured without user intervention, while on non-Plug-n-Play systems, configuration is done manually through a setup program and/or DIP switches. Cards are available to support almost all networking standards, including the latest Fast Ethernet environment. Fast Ethernet NICs are often 10/100 capable, and will automatically set to the appropriate speed. Full duplex networking is another option, where a dedicated connection to a switch allows a NIC to operate at twice the speed. NICs with multiple terminations capable of supporting multiple physical layer protocols or multiple types within a protocol are to be preferred. Within the NICs are transceivers used to connect nodes to the various Ethernet media. Most computers and network interface cards contain a built-in 10BASE-T or 10BASE2 transceiver, allowing them to be connected directly to Ethernet without requiring an external transceiver. Many Ethernet devices provide an AUI connector to allow the user to connect to any media type via an external transceiver.
The AUI connector consists of a 15-pin D-shell type connector, female on the computer side, male on the transceiver side. Thickwire (10BASE5) cables also use transceivers to allow connections. For Fast Ethernet networks, a new interface called the MII (Media Independent Interface) was developed to offer a flexible way to support 100 Mbps connections. The MII is a popular way to connect 100BASE-FX links to copper-based Fast Ethernet devices. Wireless backbone physical layer network connections are made using RF network receivers made by companies such as Linksys, Cisco, IBM, D-Link and others, using wireless protocols such as 802.11X, UWB (ultra wideband), Bluetooth, and more as the network interface 110.

The network speaker controller 120 is an embedded controller with flash memory programmed to function as a web server. Because the network speaker controller 120 and the audio format converter 121 are enabled to allow their application programming to be updated over the network, the network speaker can be made to play currently unknown digital formats in the future. The audio sources received from the network most likely will be in an encoded and/or compressed format. Digital audio coding or digital audio compression is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding protocols, synonymously called digital audio compression techniques, like MPEG Layer-III or MPEG-2 AAC, ATRAC3, WMA, Ogg Vorbis, or “Corona”, and even a packet switched Dolby Digital (AC3 over IPv6), exploit the properties of the human ear (the perception of sound) to achieve a respectable size reduction with little or no perceptible loss of quality. This compression is usually more than just reducing the sampling rate and the resolution of the samples. Basically, this is realized by perceptual coding techniques addressing the perception of sound waves by the human ear, which remove the redundant and irrelevant parts of the sound signal. The sensitivity of the human auditory system for audio signals varies in the frequency domain, being high for frequencies between 2.5 and 5 kHz and decreasing above and below that frequency band. The sensitivity is represented by the Threshold In Quiet, so that any tone below the threshold will not be perceived. The most important psychoacoustic fact is the masking effect of spectral sound elements in an audio signal, like tones and noise. For every tone in the audio signal a masking threshold can be calculated. If another tone lies below this masking threshold, it will be masked by the louder tone and remains inaudible, too.
These inaudible elements of an audio signal are irrelevant for human perception and thus can be eliminated by the encoder. The result after encoding and decoding is different from the original, but it will sound more or less the same to the human ear. How closely it sounds to the original depends on how much compression has been performed on it.
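
The Threshold In Quiet described above can be made concrete with a short sketch. The description gives no formula, so the code below substitutes a widely cited analytic approximation from the psychoacoustics literature (usually attributed to Terhardt); the formula and the function names are assumptions of this example, not part of the described system. The approximation reproduces the behavior stated above: the ear is most sensitive somewhere between 2.5 and 5 kHz, and a tone below the curve is not perceived.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold in quiet, in dB SPL, at the given frequency.

    Uses a commonly cited analytic approximation (an assumption of this
    sketch; the description itself does not specify a formula).
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def is_audible(freq_hz, level_db_spl):
    """A lone tone is perceived only if it lies above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


def most_sensitive_frequency(start_hz=100, stop_hz=16000, step_hz=100):
    """Frequency on a coarse grid where the threshold is lowest."""
    return min(range(start_hz, stop_hz + 1, step_hz), key=threshold_in_quiet_db)
```

An encoder built on this idea would discard a 0 dB SPL tone at 100 Hz, where the threshold is far higher, but would have to keep a 0 dB SPL tone near 3.3 kHz.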

Audio compression really consists of two parts. The first part, called encoding, transforms the digital audio data that resides, say, in a WAVE file, into a highly compressed form called a bitstream (or coded audio data). To play the bitstream on a soundcard, the second part, called decoding, is needed. Decoding takes the bitstream and reconstructs it to a WAVE file. The highest coding efficiency is achieved with algorithms exploiting signal redundancies and irrelevancies in the frequency domain based on a model of the human auditory system. Current coders use the same basic structure. The coding scheme can be described as “perceptual noise shaping” or “perceptual sub-band/transform coding”. The encoder analyzes the spectral components of the audio signal by calculating a filterbank (transform) and applies a psychoacoustic model to estimate the just noticeable noise level. In its quantization and coding stage, the encoder tries to allocate the available number of data bits in a way that meets both the bit rate and masking requirements. The decoder is much less complex. Its only task is to synthesize an audio signal out of the coded spectral components.
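
The bit allocation step in the quantization and coding stage can be sketched as a simple greedy loop. This is an illustration under stated assumptions rather than the method of any particular coder: each band is described by a signal-to-mask ratio in dB (hypothetical input values), each added bit is assumed to lower quantization noise by about 6.02 dB, and the function name allocate_bits is invented for the example.

```python
def allocate_bits(smr_db, budget, db_per_bit=6.02, max_bits=16):
    """Greedily spend a bit budget on the bands whose noise is most audible.

    smr_db: per-band signal-to-mask ratio in dB (illustrative input).
    budget: total number of bits available (the bit rate requirement).
    Stops early once the noise in every band is at or below its masking
    threshold (the masking requirement).
    """
    bits = [0] * len(smr_db)
    for _ in range(budget):
        # Noise-to-mask ratio of each band under the current allocation.
        nmr = [smr - db_per_bit * b for smr, b in zip(smr_db, bits)]
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not candidates:
            break
        worst = max(candidates, key=lambda i: nmr[i])
        if nmr[worst] <= 0:
            break  # all quantization noise is already masked
        bits[worst] += 1
    return bits
```

With band ratios of 12, 0 and 20 dB, a budget of 3 bits yields [1, 0, 2], while a budget of 10 bits stops at [2, 0, 4] because no further bits would produce an audible improvement.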

The term psychoacoustics describes the characteristics of the human auditory system on which modern audio coding technology is based. For the audio quality of a coded and decoded audio signal, the quality of the psychoacoustic model used by an audio encoder is of prime importance. Audio data decompression and decoding of audio formats into the audio format acceptable to the loudspeaker is performed by the audio format converter 121. This audio format conversion of different formats allows high quality low bit-rate applications, like soundtracks for CD-ROM games, solid-state sound memories, Internet audio, or digital audio broadcasting systems, to all be played over the same speaker. The audio format converter 121 function in the current embodiment of the networked speaker will be performed by an audio coding and decoding chip set (CODEC). CODEC hardware and/or software is currently available from such companies as Micronas, Sigmatel, TI, Cirrus, Motorola, Fraunhofer, and Microsoft. CODECs handle the many current encoding protocols such as WMA, MPEG-2 AAC, MP3 (MPEG Layer III), MPSPro, G2, ATRAC3, MP3PRO, “Corona” (WMAPro), Ogg Vorbis and others. To best perform the audio format conversion function, the CODEC should be designed to handle all types of audio content, from speech-only audio recorded with a low sampling rate to high-quality stereo music. The CODEC should be very resistant to degradation due to packet loss, and have efficient encoding algorithms to perform fast encodes and decodes, to minimize the size of the compressed audio files, and still produce quality sound when they are decoded. In addition, the CODEC should be highly scalable and provide high-quality mono or stereo audio content over a wide range of bandwidths, to allow selection of the best combination of bandwidth and sampling rate for the particular content being played or recorded. Content encoded at 192 Kbps by the CODEC should be virtually indistinguishable to a human ear from content originating on a compact disc.
This extremely high-quality content is called CD transparency. Shown in FIG. 7 is a circuit diagram of a CODEC circuit that could be used to implement the audio format converter 121 function of the network speaker. A preferred embodiment of this invention uses the Windows Media Audio (WMA) Audio CODEC by Microsoft. The audio format converter 121 function could also be performed by a decoder chip with no encoder functionality if no digital audio reformatting or digital encoding of analog audio is desired.
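
One way to picture how an audio format converter can be made to handle formats unknown at installation time is a table of decoders that can be extended after deployment. The sketch below is a software analogy only; the class name, method names, and the toy "u8" format are hypothetical and do not describe the CODEC hardware or any vendor API.

```python
class AudioFormatConverter:
    """Maps a format name to a decode function; new entries can be added later."""

    def __init__(self):
        self._decoders = {}

    def install_decoder(self, format_name, decode_fn):
        # Stands in for an application update delivered over the network.
        self._decoders[format_name.lower()] = decode_fn

    def supports(self, format_name):
        return format_name.lower() in self._decoders

    def decode(self, format_name, payload):
        try:
            decode_fn = self._decoders[format_name.lower()]
        except KeyError:
            raise ValueError(f"no decoder installed for {format_name!r}") from None
        return decode_fn(payload)


def decode_unsigned_8bit(payload):
    """Toy decoder: unsigned 8-bit PCM bytes to samples in [-1.0, 1.0)."""
    return [(b - 128) / 128.0 for b in payload]


converter = AudioFormatConverter()
converter.install_decoder("u8", decode_unsigned_8bit)
samples = converter.decode("U8", bytes([128, 255, 0]))
```

A source whose format is not yet in the table is rejected until a matching decoder is installed, which mirrors the requirement that the network speaker understand the audio format before it can play it.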

The digital to analog converter 122 converts a digital input into an analog level output. At the pre-amp 123, the analog signal is adjusted and filtered, and any desired active or electronic crossover may be performed. An electronic crossover is a powered electronic circuit which limits or divides frequencies. Most electronic crossovers have output controls for each individual channel. This allows the gains for all amplifiers to be set at one convenient location, and also allows a system to be level matched. Some crossovers allow the low and high pass filters to be set separately, which makes it possible to tune out acoustic peaks or valleys at or near the crossover frequencies. One of the advantages of electronic crossovers is that there is little or no insertion loss. Passive crossovers reduce the amplifier power slightly, due to their resistance. Another advantage of electronic crossovers is the ability to separate low frequencies into their own exclusive amplifier, which reduces distortion heard at high volumes in the high frequency speakers. Amplification of low frequencies requires greater power than higher frequencies. When an amplifier is at or near peak output, clipping may occur, which is able to destroy tweeters and other speakers with small voice coils. A separate low frequency amplifier allows the total system to play louder and with lower distortion. An adjustable crossover allows the user to make crossover changes easily and to immediately hear the effect of the changes. Changing the filters, or crossover points, lets users adjust the audio to meet their preferences. The electronic crossover, by limiting the frequencies to the amplifier or amplifiers connected to it, also ensures that the speakers which are connected to these amplifier(s) receive a limited frequency range, and these speakers can be optimized for the frequencies received. It also enables personal preferences in frequency range pre-amplification adjustment.
An advantage of using active filters is that they are built onto the pre-amp circuit board. Changing the filters (or crossover points) is usually accomplished through external dial turning, by changing frequency modules with a switch, or by changing crossovers if fixed types are used. An adjustable crossover is preferred.
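
The frequency-dividing behavior of a crossover can be sketched digitally. The description concerns an analog or electronic crossover at the pre-amp; the code below is only a discrete-time illustration using a first-order (one-pole) low-pass filter, with the high band taken as the remainder of the signal. The function name and parameter values are assumptions of the example, and production crossovers commonly use steeper, higher-order filters.

```python
import math


def crossover(samples, sample_rate_hz, crossover_hz):
    """Split samples into a low band and a high band at crossover_hz.

    Uses a one-pole low-pass filter; the high band is the input minus the
    low band, so the two bands always sum back to the original signal.
    """
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * crossover_hz)
    alpha = dt / (rc + dt)
    low, high = [], []
    state = 0.0
    for x in samples:
        state += alpha * (x - state)  # low-pass update
        low.append(state)
        high.append(x - state)        # complementary high band
    return low, high


# A constant (0 Hz) input ends up entirely in the low band.
low_band, high_band = crossover([1.0] * 48000, 48000, 200.0)
```

Changing crossover_hz plays the role of adjusting the crossover point: raising it sends more of the spectrum to the low-frequency amplifier, lowering it sends more to the high-frequency amplifier.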

The amplifier 125 is comprised of one or more amplifier circuits that amplify the audio signal to the desired final signal strength. Using multiple amplifiers takes advantage of the crossover frequency filtering to optimize each amplifier for the frequency range received. Amplifiers using the latest in digital amplifier technology, which can efficiently produce large amounts of power with a much smaller heat sink than in past designs, are preferable, and this also will eliminate the need for another DAC. The speaker driver 126 is comprised of one or more speaker driver circuits. Using multiple drivers for multiple speakers allows the speakers to be optimized for the frequency range received. The speaker components 127 convert the signal to sound, are voiced and designed to handle a wide dynamic range of audio frequencies, and are able to aid in the accurate reproduction of sound from a digital source.

FIG. 3 shows another network speaker embodiment 100B. The speaker embodiment 100B includes all of the components of the speaker embodiment 100A, and identical components bear the same reference numerals. In addition, speaker embodiment 100B includes an analog to digital converter (ADC) 128 and a modified speaker/microphone driver 126 b. The Speaker Driver 126 b circuitry is expanded to serve as both an output driver and a microphone input for half duplex operation (one way conversations), and an internal microphone can implement full duplex operation (simultaneous two way conversations). The microphone input is sent to the pre-amp 123 for signal adjustment and filtering. From there it is sent to the analog to digital converter 128 to convert the analog signal to a simple digital format. The audio format converter 121 then takes the digital microphone input and compresses and encodes it into a desired format for distribution. The encoded input, the format of which may vary depending on the application, is sent to the network controller 120 where, depending on the software application and programming, its final destination and function are determined. The input may be stored locally for future audio feedback, used locally, or sent out to the network through the network interface 110. The input could be used with a voice recognition application to initiate spoken audio or home control commands. Speaker sensor 124 feedback received by the pre-amp 123 can also be sent to the ADC 128 to be converted from analog to digital format, and then passed on to the network controller 120. Depending on the network controller 120 applications, the feedback can then be sent out the network interface 110 onto the network backbone 200 as an alarm or other condition if desired.
Additional features in the audio format converter 121, in conjunction with application software, could enable the ability to change audio setting(s) based on the type of music that is being played, the user playing it, or the Time of Day (TOD). The network speaker 100B may have the ability, through the audio format converter 121 or other circuitry, to support headphones.

FIG. 4 depicts another network speaker embodiment 100C with wireless remote control access. All components of speaker embodiment 100B are present in speaker embodiment 100C and bear the same reference numerals. In addition, further components provide wireless remote control from IR and RF remotes. It should be noted that these additional components could have been added to the network speaker embodiment 100A as well. An internal IR sensor 131 senses IR from one or more external IR remotes 170. The sensed IR is sent to an IR receiver 130 that processes the IR input, and the processed IR input is sent to the network controller 120, which then performs commands as per its application software. If desired, the IR sensor 131 may be external to the speaker 100C, so that the speaker can be installed behind a wall as a wall speaker and still receive IR. The network controller 120 can send the processed IR commands out the network interface 110 onto the network to be processed remotely by the legacy Converter/Controller 600, which then translates them into commands to the legacy sources 500. Alternatively, the network controller 120 can send the processed IR commands out the network interface 110 onto the network to be processed remotely by the legacy Converter/Controller 600, which then translates them into legacy home control network 700 commands. In the same manner, RF control access is provided by an RF Sensor/Transceiver 135, which receives input from RF remotes 175 and other network speaker transceivers, and transmits information to the network controller 120. While this embodiment 100C shows both IR and RF access through the same network speaker, it will be appreciated that IR only control access or RF only control access could be implemented.

The wireless control access allows IR or RF input to the speakers 100C to be used to remotely control the system 20, including control of the audio (including multi destination sync), video, HVAC, security, room light level, house scenes, etc., if the system is so programmed. Where the software application includes the ability to “learn” new IR commands and associate them with audio or house control commands, existing legacy sources with IR remotes can be integrated into the network controller through the legacy Converter/Controller 600. And because the legacy Converter/Controller 600 is upgradeable over the network, the network speaker IR input ability could be made to control currently unknown system devices in the future.
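A minimal sketch of how application software might “learn” an IR command and associate it with a control command is given below. The class and field names, the target identifiers and the sample IR code value are all hypothetical; the specification does not prescribe a data structure for this association.

```python
from typing import Dict, NamedTuple, Optional


class LearnedCommand(NamedTuple):
    target: str   # hypothetical identifier of the node or legacy device
    command: str  # hypothetical audio or house control command name


class IRCommandTable:
    """Associates decoded IR codes with audio or house control commands."""

    def __init__(self) -> None:
        self._table: Dict[int, LearnedCommand] = {}

    def learn(self, ir_code: int, target: str, command: str) -> None:
        """Store (or overwrite) the association for a newly sensed IR code."""
        self._table[ir_code] = LearnedCommand(target, command)

    def translate(self, ir_code: int) -> Optional[LearnedCommand]:
        """Return the associated command, or None for an unlearned code."""
        return self._table.get(ir_code)


# Example use: the code value below is an arbitrary placeholder.
table = IRCommandTable()
table.learn(0x20DF10EF, "legacy_source_500", "POWER_TOGGLE")
```

Because the table is ordinary data, it could be updated over the network along with the rest of the application programming.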

FIG. 5 shows another network speaker embodiment 100D that serves as a bridge between one or more wireless network devices and a wired segment of the network 200, known as a wireless access point. This wireless access point embodiment includes the components of embodiment 100B with additional components added for wireless-wired bridging, such as dual mode ad-hoc to infrastructure mode. The network 200 consists of at least one physically wired network section 240 and at least one wireless network segment 250. The network interface 110 consists of two parts, a wired network interface 111 connecting the network speaker 100D to the wired network backbone 240 and an RF network interface 112 connecting the network speaker 100D to the wireless RF network backbone 250. Network communication can pass between the wired backbone 240 and the wireless RF network backbone 250 via the network speaker 100D. The RF network interface 112 consists of an RF receiver/transmitter capable of both receiving and sending RF network communication.

FIG. 6 illustrates another speaker embodiment 100E that has wireless control access and that serves as a wireless access point. This wireless access point embodiment includes all of the components of embodiments 100B, 100C and 100D.

If a home has a network speaker type system, the application software opens all kinds of possibilities. New sources or new source content may enable these intelligent speakers 100 to have more features and playback formats that are not in existence today, and to adjust to the source content. An example of this would be the ability to change audio settings based on the type of music that is being played, the user playing it, or the Time of Day (TOD). This will be highly customizable long past the time of installation, keeping the audio system upgradeable without structural changes to the home even if the network speakers are embedded in walls and other not easily accessed locations. In addition, a network speaker 100 with a microphone and the appropriate application software could record and route messages digitally to any house network node or internet node; locate and identify a user in a room, which in turn enables the system 20 to route voice mail and messages, on demand, to the room the user is presently in; serve as a voice recognition and authorization point to enable direct voice control of any node on the network or any legacy audio source 500 or legacy control network 700 device that may be connected to the network 200 through a legacy converter/controller 600; or automatically record and/or route voice messages from one user to the room in which the recipient identified in the voice message is currently located. Multiple network speakers 100 with microphones in one room could even triangulate the location of the user, which in turn enables the system to optimize the audio for the user's current location.

The network speaker 100 with sufficient memory and the appropriate application software could store voice mail to be played on demand by the room user or, in a totally wireless network 200, serve as a wireless repeater within a home if the signal strength of the wireless communication medium was insufficient to reach all rooms or areas of the home from all locations. Also, a strategically placed network speaker 100 serving as a wireless access point allows the communication of audio, data, commands or any other communications from mobile network nodes whenever they are within communication range, such as PDAs, mobile controllers, mobile computers, wireless headphones, or network speakers 100 in mobile units such as automobiles.

A network speaker 100 with IR or RF receivers and the appropriate application software would allow wireless remote control, status and feedback from any IR or RF remote, or other network speaker transceiver, to any node on the network or any legacy audio source 500 or legacy control network 700 device that may be connected to the network 200 through a legacy converter/controller 600. A network speaker 100 with an RF receiver capable of transmitting RF could enable wireless non-networked headphones. Also, a network speaker 100 could encode and transmit sound and images from a room out on the network, as well as act as the source point for room control and automation and for voice recognition services for control and automation. In addition, a network speaker 100 could participate in a multi speaker session during which each network speaker 100 could operate in a master or slave mode. A network speaker 100 in the master mode would originate, calculate, control and distribute the multi session clocks. A network speaker 100 in the slave mode would receive clocking information from the master via TCP/IP and/or RF in a multi session mode.
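The specification does not state how a slave speaker would apply the clocking information received from the master. One possible approach, shown here only as a sketch, is the four-timestamp offset estimate used by NTP-style protocols; this choice, and all of the names below, are assumptions rather than part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncSample:
    t0: float  # slave clock: request sent
    t1: float  # master clock: request received
    t2: float  # master clock: reply sent
    t3: float  # slave clock: reply received


def estimate_offset(s: SyncSample) -> float:
    """Estimated (master - slave) clock offset, assuming symmetric path delay."""
    return ((s.t1 - s.t0) + (s.t2 - s.t3)) / 2.0


def round_trip_delay(s: SyncSample) -> float:
    """Network round-trip time, excluding the master's processing time."""
    return (s.t3 - s.t0) - (s.t2 - s.t1)


def to_master_time(slave_time: float, offset: float) -> float:
    """Convert a slave timestamp to the master's session clock."""
    return slave_time + offset
```

With an offset estimate in hand, a slave could schedule playback of a given audio packet at the same master-clock instant as every other speaker in the session. The estimate is only as good as the symmetric-delay assumption, which may not hold on a wireless segment.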

A network speaker 100 could additionally serve as a local audio source within the room, playing audio from internal solid-state memory or, if components were added to receive and play back digital and analog terrestrial radio frequencies (AM/FM/CATV), from terrestrial broadcasts.

FIG. 8 depicts a legacy Audio Converter/Controller 600 embodiment, which includes many components similar to those of the Network Speaker 100. The legacy Audio Converter/Controller 600 communicates with the network backbone (Ethernet) 200 through a network interface 610, which handles the physical layer hardware protocol and may connect to one or more physical layers, wired or unwired or both. Coupled to the network interface 610 is a Network Controller 620, which provides the intelligence to run various application features of the legacy Audio Converter/Controller 600, including the higher levels of the TCP/IP protocol. The Network Controller 620 controls an audio format converter 621, which converts the legacy source audio into the desired network digital format for distribution. Digital audio from legacy sources 500 is transmitted directly to the audio format converter 621 to be re-formatted into the desired digital format. Analog audio from legacy sources 500 is fed to an analog to digital converter (“ADC”) 622, and the resultant digitized signal then goes to the audio format converter 621 to be coded into the desired digital format. The Network Controller 620 takes the properly formatted digital audio and sends it to the network 200 via the network interface 610. Also, the audio format converter 621 may consist of multiple encoders to provide multiple conversions of different legacy audio inputs simultaneously. The Legacy Converter/Controller 600 uses the analog video from the legacy source device for encoding to a packet switched digital format such as WMAPro “Corona”, and also uses the analog video inputs for power status and feedback.

The network interface 610 may vary depending on the physical medium and physical layer protocol used. Network interface cards, commonly referred to as NICs, are often used to connect a PC to a wired network, and are used in the preferred embodiment when the network backbone is some form of wired cable or fiber optics. Such a NIC provides a physical connection between the networking cable and the computer's internal bus. Different computers have different bus architectures; the most common are PCI found on 486/Pentium PCs and ISA expansion slots commonly found on 386 and older PCs. Typically NICs come in three basic varieties: 8-bit, 16-bit, and 32-bit. The larger the number of bits that can be transferred to the NIC, the faster the NIC can transfer data to the network cable. Many NIC adapters comply with Plug-n-Play specifications. On these systems, NICs are automatically configured without user intervention, while on non-Plug-n-Play systems, configuration is done manually through a setup program and/or DIP switches. Cards are available to support almost all networking standards, including the latest Fast Ethernet environment. Fast Ethernet NICs are often 10/100 capable, and will automatically set to the appropriate speed. Full duplex networking is another option, where a dedicated connection to a switch allows a NIC to operate at twice the speed. NICs with multiple terminations capable of supporting multiple physical layer protocols, or multiple types within a protocol, are preferred, so that the NICs include the transceivers used to connect nodes to the various Ethernet media. Most computers and network interface cards contain a built-in 10BASE-T or 10BASE2 transceiver, allowing them to be connected directly to Ethernet without requiring an external transceiver. Many Ethernet devices provide an AUI connector to allow the user to connect to any media type via an external transceiver. The AUI connector consists of a 15-pin D-shell type connector, female on the computer side, male on the transceiver side. Thickwire (10BASE5) cables also use transceivers to allow connections. For Fast Ethernet networks, a new interface called the MII (Media Independent Interface) was developed to offer a flexible way to support 100 Mbps connections. The MII is a popular way to connect 100BASE-FX links to copper-based Fast Ethernet devices. Wireless backbone physical layer network connections are made using RF network receivers made by companies such as Linksys, Cisco, IBM, D-Link, and others, using wireless protocols such as 802.11X, UWB, Bluetooth, and more as the network interface 610.

Because the network controller 620 is an embedded controller with flash memory programmed to function as a web server, and is enabled together with the audio format converter 621 to allow their application programming to be updated over the network, the legacy Audio Converter/Controller 600 can be made to code audio to currently unknown digital formats in the future. As in the speaker embodiments described above, the desired audio to be distributed will likely be in a coded and/or compressed format. Digital audio coding or digital audio compression is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding protocols, synonymously called digital audio compression techniques, like MPEG Layer-III or MPEG-2 AAC, ATRAC3, G2, WMA, Ogg Vorbis, or WMAPro “Corona”, exploit the properties of the human ear (the perception of sound) to achieve a respectable size reduction with little or no perceptible loss of quality. As described above, this compression, in addition to reducing the sampling rate and the resolution of the audio samples, employs perceptual coding techniques that address the perception of sound waves by the human ear and remove the redundant and irrelevant parts of the sound signal. The sensitivity of the human auditory system to audio signals varies in the frequency domain, being high for frequencies between 2.5 and 5 kHz and decreasing above and below this frequency band. The sensitivity is represented by the Threshold in Quiet. Any tone below this threshold will not be perceived. The most important psychoacoustic fact is the masking effect of spectral sound elements in an audio signal, such as tones and noise. For every tone in the audio signal a masking threshold can be calculated. If another tone lies below this masking threshold, it will be masked by the louder tone and remain inaudible too. These inaudible elements of an audio signal are irrelevant for human perception and thus can be eliminated by the coder. The sound resulting after coding and decoding is different, but will be perceived as more or less the same by the human ear. How close it sounds to the original depends on how much compression has been performed.
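The Threshold in Quiet can be illustrated numerically. The sketch below uses a widely cited closed-form approximation of the absolute hearing threshold (commonly attributed to Terhardt); this formula is not given in the specification and is used here only as an assumed stand-in for whatever psychoacoustic model a particular coder employs.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate absolute hearing threshold in dB SPL at `freq_hz`.

    Uses the common closed-form approximation
    3.64 f^-0.8 - 6.5 exp(-0.6 (f - 3.3)^2) + 1e-3 f^4, with f in kHz.
    """
    f = freq_hz / 1000.0
    return (
        3.64 * f ** -0.8
        - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
        + 1e-3 * f ** 4
    )


def is_audible_in_quiet(freq_hz: float, level_db_spl: float) -> bool:
    """A lone tone is perceived only if its level exceeds the threshold."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)
```

Under this approximation the threshold is lowest in the region of a few kilohertz and rises steeply toward both ends of the audible range, which is consistent with the sensitivity band described above. Masking by louder neighbouring tones would raise the effective threshold further and is not modelled in this sketch.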

Audio compression actually consists of two parts. The first part, called coding or encoding, transforms the digital audio data that resides, say, in a WAVE file, into a highly compressed form called bitstream (or coded audio data). To play the bitstream on your soundcard, you need the second part, called decoding. Decoding takes the bitstream and reconstructs it to a WAVE file. Highest coding efficiency is achieved with algorithms exploiting signal redundancies and irrelevancies in the frequency domain based on a model of the human auditory system. Current coders use the same basic structure to produce coding that can be described as “perceptual noise shaping” or “perceptual sub-band/transform coding”. The encoder analyzes the spectral components of the audio signal by calculating a filterbank (transform) and applies a psychoacoustics model to estimate the just noticeable noise-level. In its quantization and coding stage, the encoder tries to allocate the available number of data bits in a way to meet both the bit rate and masking requirements. The decoder is much less complex. Its only task is to synthesize an audio signal out of the coded spectral components. Psychoacoustics describes the characteristics of the human auditory system on which modern audio coding technology is based. For the audio quality of a coded and decoded audio signal the quality of the psychoacoustics model used by an audio encoder is of prime importance.
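The quantization and coding stage can be illustrated with a simplified greedy bit allocation. This is only a sketch and not the algorithm of any particular coder: it assumes per-band signal and masking levels are already available in dB, that the quantization noise in a band with no bits is roughly at the signal level, and that each added bit lowers that noise by about 6.02 dB. The function name and parameters are hypothetical.

```python
from typing import List

DB_PER_BIT = 6.02  # approximate noise reduction per quantizer bit


def allocate_bits(
    signal_db: List[float],
    mask_db: List[float],
    total_bits: int,
    max_bits_per_band: int = 16,
) -> List[int]:
    """Greedily give bits to the band whose noise most exceeds its mask."""
    n = len(signal_db)
    bits = [0] * n
    # Noise-to-mask ratio per band when no bits have been assigned yet.
    nmr = [s - m for s, m in zip(signal_db, mask_db)]
    for _ in range(total_bits):
        candidates = [
            i for i in range(n) if bits[i] < max_bits_per_band and nmr[i] > 0
        ]
        if not candidates:
            break  # all remaining noise is already below the masking threshold
        band = max(candidates, key=lambda i: nmr[i])
        bits[band] += 1
        nmr[band] -= DB_PER_BIT
    return bits
```

When the bit budget runs out before every band is masked, the loop simply stops, leaving the least audible noise unaddressed; when the masking requirement is met early, unused bits remain available, which mirrors the trade-off between bit rate and masking requirements.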

The audio format converter 621 performs audio data compression and encoding of audio formats into the audio format acceptable for distribution to the end receiver on the network and can consist of an audio encoder-decoder chip (CODEC). To best perform the audio format conversion function, the CODEC should be designed to handle all types of audio content, from speech-only audio recorded with a low sampling rate to high-quality stereo music. The CODEC should be very resistant to degradation due to packet loss, and have efficient encoding algorithms to perform fast encodes and decodes, and to minimize the size of the compressed audio files, and still produce quality sound when they are decoded. Also, the CODEC should be highly scalable and provide high-quality mono or stereo audio content over a wide range of bandwidths, to allow selection of the best combination of bandwidth and sampling rate for the particular content being played or recorded. Content encoded at 192 Kbps by the CODEC should be virtually indistinguishable to a human ear from content originating on a compact disc. This extremely high-quality content is called CD transparency.
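For a sense of scale, the 192 Kbps figure can be compared with the uncompressed data rate of compact disc audio. The sketch below assumes the standard CD audio parameters (44.1 kHz sampling, 16-bit samples, two channels) and takes 1 Kbps to mean 1000 bits per second; neither assumption is stated in the specification.

```python
# Standard CD audio parameters (assumed; not stated in the specification).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

ENCODED_BITRATE_BPS = 192_000  # 192 Kbps, taking 1 Kbps = 1000 bit/s


def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_bitrate_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
compression_ratio = cd_bitrate_bps / ENCODED_BITRATE_BPS

print(f"CD PCM rate: {cd_bitrate_bps} bit/s")
print(f"Compression ratio at 192 Kbps: {compression_ratio:.2f}:1")
```

Under these assumptions the CODEC would carry CD-transparent audio in roughly one seventh of the bandwidth of the uncompressed stream.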

The analog to digital converter 622, commonly referred to as an ADC, converts an analog level input to a digital output. Adding a microphone speaker input to the ADC will enable voice control of the legacy Audio Converter/Controller 600. It would also enable the legacy Audio Converter/Controller 600 to record audio input for later use as system messages or audio feedback. Depending on the software application and programming in the network controller 620, the audio input may be stored locally for future audio feedback, used locally, or it may be fed out to the network through the network interface 610. The audio input could be used with a voice recognition application to initiate spoken audio or home control commands.

The Legacy Audio Converter/Controller 600 may also communicate with the legacy sources 500 using a legacy communication method, such as IR or serial commands, that is understood by the legacy device. The planned embodiment of the invention will use the fixed set of serial commands already understood by the target legacy source. The network controller 620 controls and communicates with a legacy controller 624, which in turn communicates with the legacy source 500 through a legacy audio network interface 623. In a preferred embodiment of the invention, an RS-232 serial command interface will be used. The functions of the network controller 620 and the legacy controller 624 can be combined into one embedded controller.

The Legacy Audio Converter/Controller 600 may also communicate with the legacy home control network 700 using the network communication method understood and practiced by the legacy home control network 700, and such communication may vary greatly depending on the legacy home control network 700 being used. A preferred embodiment of the invention will use the CEBus powerline protocol for its communication method. The legacy controller 624 controls and communicates with the legacy home control network 700 via a legacy home control network interface 625. The functions of the legacy controller 624 in controlling the legacy sources 500 and the legacy home control network 700 could be separated out into two separate embedded controllers, or combined with the network controller 620. If no legacy source 500 is available, the legacy audio network interface 623 and the legacy source control function of the legacy controller 624 may be eliminated. Similarly, in the absence of a legacy home control network 700, the legacy home control network interface 625 and the legacy home network control function of the legacy controller 624 may be eliminated.

As illustrated in FIG. 9, network speaker 100F can receive DC current from external regulated power supplies over existing 14-18 AWG speaker wire or can employ PoE (Power over Ethernet) technology to trickle charge the battery. Also, charge status can be provided for the battery 800.

Network speaker 100F has power applied as DC current from a rechargeable battery source 800 either located within the speaker or inserted into the speaker as a removable battery pack. This would also allow for line power status reporting, so that a function specific to the application could be performed when a change in line power occurs.

FIG. 10 depicts another speaker embodiment 100G which also can be battery powered. In addition, the speaker 100G includes an ESM (Energy Storage Module) which improves audio performance.

Obviously, many modifications and variations of the present invention are possible in light of the above teachings. It is to be understood, therefore, that the invention can be practiced otherwise than as specifically described.

1. A networked audio system comprising: a plurality of speaker nodes; a plurality of self-sufficient audio source node devices connected to said speaker nodes via an internet protocol network; service interface means connected to said speaker nodes via said network; one or more legacy sources connected to said speaker nodes via said network; and control and converter means connected between said one or more legacy sources and said speaker nodes.
2. A networked audio system according to claim 1 wherein said control and converter means comprises control and encoding nodes.
3. A networked audio system according to claim 1 wherein each said speaker node renders audio media content and distributes, controls, and serves media content connected to said speaker nodes via said network.
4. A networked audio system according to claim 1, wherein each said speaker node comprises a microphone and signal processor means.
5. A networked audio system according to claim 4 wherein said processor means comprises a digital signal processor.
6. The audio system according to claim 5 wherein said digital signal processor comprises a real time adaptive analyzer.
7. A networked audio system according to claim 4 wherein said processor means comprises a speaker controller for providing intelligence to operate application protocol.
8. A networked audio system according to claim 7 wherein said processor means further comprises a format converter for converting a digital audio format input into an analog output.
9. A networked audio system according to claim 8 wherein said processor means further includes an amplifier and a pre-amplifier with an active crossover for limiting frequency output to said amplifier.
10. A networked audio system according to claim 8 wherein said speaker controller and format converter can be updated over said network with increased processing power and/or memory to accommodate future digital formats.
11. A networked audio system according to claim 5 wherein said processor means of each speaker comprises speaker sensor means providing information feedback to other said speakers.
12. A networked audio system according to claim 8 wherein said speaker controller is an embedded controller with flash memory programmed to function as a web server and together with said format converter are enabled to allow their application programming to be updated over said network and to thereby accommodate future digital formats.
13. A networked audio system according to claim 1 wherein said network comprises a wired section and one or more wireless segments, and one or more of said speaker nodes comprises interface means providing a communication bridge between said wire section and said one or more wireless segments.
14. A networked audio system according to claim 13 wherein said interface means comprises an RF receiver/transmitter capable of both receiving and sending RF network communication.
15. A networked audio system according to claim 1 wherein said speaker nodes are adapted to participate in multi speaker sessions to establish a master speaker node and slave speaker nodes which would receive clocking control information from said master speaker node.
16. A networked audio system according to claim 1 wherein said control and converter means is adapted to convert analog signals from said legacy sources into a desired digital format for distribution to all of said speaker nodes.
17. A networked audio system according to claim 1 wherein said control and converter means is adapted to convert digital signals from said legacy sources into a desired re-formatted digital format for distribution to all of said speaker nodes.
18. A networked audio system according to claim 17 wherein said control and converter means is further adapted to convert analog signals from said legacy source into said desired digital format for distribution to all of said speaker nodes.
19. A networked audio system according to claim 18 wherein control and converter means comprises multiple encoders to provide simultaneous conversion of signals from said legacy sources into said desired digital format.